This paper examines the use of a residual bootstrap for bias correction in machine learning regression methods. Accounting for bias is an important obstacle in recent efforts to develop statistical inference for machine learning methods. We demonstrate empirically that the proposed bootstrap bias correction can lead to substantial improvements in both bias and predictive accuracy. In the context of ensembles of trees, we show that this correction can be approximated at only double the cost of training the original ensemble, without introducing additional variance. Our method is shown to improve test-set accuracy over random forests by up to 70% on example problems from the UCI repository.
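To make the general idea concrete, here is a minimal sketch of a generic residual-bootstrap bias correction. This is an illustration of the standard textbook procedure, not the paper's exact algorithm; it uses a simple k-NN regressor (a hypothetical stand-in for a tree ensemble) and synthetic data. The steps are: fit the model, resample its residuals to generate bootstrap responses, refit on each bootstrap sample, estimate the bias as the average bootstrap prediction minus the original prediction, and subtract that estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

def knn_predict(X_train, y_train, X_query, k=5):
    # Simple 1-d k-NN regressor, standing in for any ML regression method.
    d = np.abs(X_train[None, :] - X_query[:, None])
    idx = np.argsort(d, axis=1)[:, :k]
    return y_train[idx].mean(axis=1)

# Synthetic 1-d regression problem.
X = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * X) + rng.normal(0.0, 0.1, X.size)

yhat = knn_predict(X, y, X)   # original in-sample fit
resid = y - yhat              # residuals to resample

B = 50
boot_preds = np.empty((B, X.size))
for b in range(B):
    # Bootstrap response: fitted values plus resampled residuals.
    y_star = yhat + rng.choice(resid, size=resid.size, replace=True)
    boot_preds[b] = knn_predict(X, y_star, X)

# Bootstrap bias estimate and corrected predictions:
# corrected = yhat - bias_hat = 2 * yhat - mean of bootstrap fits.
bias_hat = boot_preds.mean(axis=0) - yhat
y_corrected = yhat - bias_hat
```

The correction requires B refits in general; the abstract's point is that for tree ensembles this can be approximated at only about twice the original training cost.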